BENTON HARBOR POWER PLANT LIMNOLOGICAL STUDIES 



PART XX. STATISTICAL POWER OF A PROPOSED METHOD 
FOR DETECTING THE EFFECT OF WASTE HEAT 
ON BENTHOS POPULATIONS 



Edward M. Johnston 



Under contract with: 

American Electric Power Service Corporation 
Indiana and Michigan Electric Company 



Special Report No. 44 
of the 
Great Lakes Research Division 
University of Michigan 



December 1974 



TABLE OF CONTENTS 

Page 

LIST OF TABLES ii

ABSTRACT iii 

INTRODUCTION 1 

METHODS AND RESULTS 1 

Significance and power 1 

Expected mean squares for Model A 3 

Estimation of the sampling error variance 7 

Estimation of the interaction variance $\sigma^2_{\beta\gamma}$ 18

Calculation of the least detectable true change $\delta$ 24

Derivation of a least detectable true ratio in terms of $\delta$ . . . 26

Evaluation of cJ using the variances determined previously ... 28 

REFERENCES 29 



LIST OF TABLES 

Table Page 

1. Linear model and expected mean squares for Design A, 

a nested analysis of variance ..... . . 5 

2. Dates of the surveys analyzed in this report 7 

3. The DC, NDC and SDC stations 8 

4. Estimation of the sampling error in the random survey 10 

5. Data used in the ANOVAs of Tables 4 and 8 11 

6. Data for the estimate of sampling error in the grid survey . . 15

7. Estimation of the sampling error in the grid survey 19 

8. Estimation of the interaction variance $\sigma^2_{\beta\gamma}$ 21

9. Linear model and expected mean squares for Design B, 

a factorial analysis of variance 23 






ABSTRACT

In Part 18 of this report series, a five-way analysis of variance was pro- 
posed to assess the effect of waste heat on benthos populations near the Cook 
Plant. The complete period of study would be 1971 through 1978. Ultimately, 
four years of preoperational and four years of operational data would be avail- 
able for the analysis. This report calculates the expected sensitivity of the 
proposed ANOVA design. 

An equation given by Sokal and Rohlf (1969) can be used to express the 
least detectable true difference between two treatment means as a function of 
the following: the significance level of the test, the statistical power (the 
desired probability that a true difference will be detected) , the degrees of 
freedom for error, the number of observations per treatment, and the error 
standard deviation. In this report the significance level is set at 5% and 
the power at 95%. These requirements lead to an estimate of the smallest true 
change in the benthos that is detectable by the existing sampling program. 
This is expressed as a change due to plant operation in the ratio of the true 
mean population at the inner stations (near the outfall) to the mean at the 
outer stations. It turns out that this ratio would have to increase or de- 
crease by a factor of 5.48 for a change to be detected by the planned ANOVA. 

An important step in the calculation is to find an estimate $\hat\sigma$ for
the error standard deviation $\sigma$. This will be the square root of the mean
square used as the denominator for the F-test. In the present design, which is
a mixed model nested ANOVA, the test for a heat effect is the test of the
interaction of the construction time and outfall distance factors. The
appropriate denominator can be found using a table of expected mean squares.
The square root of this denominator gives
$\sqrt{\sigma^2_\epsilon + 27\,\sigma^2_{\beta\gamma}}$ as the value for
$\sigma$, where $\sigma^2_\epsilon$ is the within-cell error (sampling error)
and $\sigma^2_{\beta\gamma}$ is the variance of the model term for the
interaction of year and outfall distance. Data already taken are used to obtain
the following estimates: $\hat\sigma^2_\epsilon = .49527$ and
$\hat\sigma^2_{\beta\gamma} = .01000$.

Two features of this model are distinctive: (1) the nesting of the year
factor, and (2) the use of a non-zero estimate of the interaction variance
$\sigma^2_{\beta\gamma}$. While these choices reduce the predicted power, they
probably lead to a more realistic model.



INTRODUCTION 

In a previous report in this series (Johnston 1973) , two methods were pro- 
posed for assessing the effect of waste heat on benthos populations near the 
Cook Plant. The first of these was a graphical technique. The second was a 
five-way analysis of variance (ANOVA) that would require four years of pre- 
operational data and four years of operational data. The present report is 
concerned with the second method. It will give a calculation of the least 
change in the benthos that is detectable with 95% power by the proposed anal- 
ysis of variance. 

METHODS AND RESULTS

Significance and Power 

In this report, the level of significance $\alpha$ used will be .05 in the
statistical test for the presence of a heat effect. This means that we accept
a 5% probability of a Type I error. Such an error would be a conclusion that
a certain biological parameter had changed when there had in fact been no
change. In the analysis of variance there is also a risk of a Type II error,
that is, that a true change in the biological parameter would not be found
significant by the test. The probability of a Type II error is $\beta$, and the
statistical power P is equal to $1-\beta$. In general, for a given level of
$\alpha$ and a given true change, the power of a test increases with the number
of replicates. In this report we will require that P = .95 and compute the true
change in the benthos population for which this power will be achieved. An
equation relating these quantities is as follows (derived from an equation of
Sokal and Rohlf 1969, p. 247):



$$\delta = \sigma \sqrt{\frac{2}{n}} \left\{ t_{\alpha[\nu]} + t_{2(1-P)[\nu]} \right\}$$

where

$\delta$ = least detectable true difference

$\sigma$ = true error standard deviation

$\nu$ = degrees of freedom of the error mean square

n = number of observations at each of the two treatment levels

t = Student's t

$\alpha$ = significance level

P = power (the desired probability that a difference will be
found to be significant)

An example to which this formula could be applied is the following. The means
of two treatments are to be compared where there are n observations per
treatment and $\sigma$ is the true standard deviation of the observations about
their respective treatment means. The test could be either a two-sample t-test
or a one-way ANOVA. If we set $\alpha$ = .05 and P = .95, then $\delta$ is the
minimum amount by which the true treatment means must differ if there is to be
95% probability that the means of two samples of size n will be found
significantly different at the 5% level.
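
The two-treatment example can be put into numbers. The following is a minimal
sketch, assuming the relation has the form
delta = sigma * sqrt(2/n) * (t_alpha[nu] + t_2(1-P)[nu]), which is what results
from solving the Sokal and Rohlf sample-size expression for delta. The function
name and the inputs (sigma = 1, nu = 6) are hypothetical and chosen only for
illustration; the two t values are standard two-tailed table values for 6
degrees of freedom, not numbers quoted from this report.

```python
import math

# Two-tailed critical values of Student's t for 6 degrees of freedom,
# copied from standard tables (an assumption of this sketch).
T_ALPHA_05_DF6 = 2.447  # t at alpha = .05
T_POWER_95_DF6 = 1.943  # t at 2(1 - P) = .10, i.e. P = .95


def least_detectable_difference(sigma, n, t_alpha, t_power):
    """Return delta = sigma * sqrt(2/n) * (t_alpha + t_power).

    sigma   : true error standard deviation
    n       : observations at each of the two treatment levels
    t_alpha : tabled t for the significance level
    t_power : tabled t for 2(1 - P)
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return sigma * math.sqrt(2.0 / n) * (t_alpha + t_power)


if __name__ == "__main__":
    # The degrees of freedom are held at 6 here purely for illustration;
    # in a real two-sample test they would grow with n.
    for n in (5, 20, 80):
        delta = least_detectable_difference(
            1.0, n, T_ALPHA_05_DF6, T_POWER_95_DF6
        )
        print(f"n = {n:3d}  least detectable difference = {delta:.3f}")
```

With the t values held fixed, quadrupling n halves the least detectable
difference, which is the sense in which power increases with replication.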

The next step is to relate the above formula to the eight-year experimental
design proposed in Part 18 (Johnston 1973). It was stated that the statistical
test for a heat effect on a given taxon would be the F-test of the interaction
of the outfall distance and construction time factors. The formula stated
above can still be applied, since this interaction has only a single degree of
freedom. In such a case, the F-test and the t-test give equivalent results
($F_{\alpha[1,\nu]} = t^2_{\alpha[\nu]}$). The value used for $\sigma$ will be
the square root of the expected error mean square of the ANOVA.

The distinction of inner and outer stations was made earlier in Part 18. 
The idea was used in previous work on Lake Michigan benthos, in connection 
with the Palisades plant (Beak Consultants 1973). It is worth restating why 



such a plan may be desirable. Stations far from the plant can serve as control 
or reference stations. If ambient lake conditions change, then stations of 
equal depth near to and far from the outfall should be similarly affected. By 
using the difference between inner and outer stations, we should be working 
with a quantity that is relatively less affected by ambient lake changes. It 
should be noted that we use the difference between the transformed population
densities (y = log(x+1)) in the inner and outer areas of each depth zone,
averaged over all months, zones and years. (x is the actual benthos density in
animals per square meter; y is the transformed density.)
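
A short sketch of the transformation may help. It assumes base-10 logarithms,
which is consistent with the tabulated values later in the report (for example
x = 139 gives y = 2.1461 in Table 5). The function names are hypothetical. The
point illustrated is that a difference of transformed densities is the
logarithm of a ratio of (x+1) values, so a change in the inner-minus-outer
difference corresponds to a multiplicative change in the inner/outer ratio.

```python
import math


def transform(x):
    """y = log10(x + 1), where x is a density in animals per square meter."""
    if x < 0:
        raise ValueError("density cannot be negative")
    return math.log10(x + 1.0)


def ratio_from_difference(diff):
    """Convert a difference of transformed densities back to a ratio."""
    return 10.0 ** diff


if __name__ == "__main__":
    inner = transform(139)  # an inner zone A station, July/70 (Table 5)
    outer = transform(103)  # an outer zone A station, July/70 (Table 5)
    diff = inner - outer
    print(f"inner y = {inner:.4f}, outer y = {outer:.4f}")
    print(f"difference = {diff:.4f}, ratio = {ratio_from_difference(diff):.3f}")
```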

Elliott (1971) favors the use of the logarithmic transformation with benthos
data. "As the variance of a bottom sample is often greater than the arithmetic
mean, log (x+1) is probably the most useful transformation" (op. cit.
p. 32).



Expected Mean Squares for Model A 

To determine the power of the test for a heat effect it is necessary to 
find the appropriate denominator for the F-ratio. In a factorial ANOVA with 
all effects fixed, this would be a simple step since the within-cell mean 
square (MS) would be used as the denominator in all F-tests. The present 
ANOVA is different since the year factor (4 levels) is nested within the con- 
struction time factor (2 levels). The relation of these two factors is as 
follows : 

                              1971  1972  1973  1974  1975  1976  1977  1978

                    Before     X     X     X     X
Construction time
                    After                              X     X     X     X

Data will be available only for the combinations marked with an X. This de- 
sign is a mixed model, since it contains both fixed and random factors 
(Scheffe 1959, p. 6). 
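
The nesting can be made concrete with a small sketch. It assumes the split
implied by the study plan (four preoperational years followed by four
operational years within 1971 through 1978); the names used are hypothetical.
Each year appears under exactly one level of construction time, which is what
distinguishes a nested factor from a crossed one.

```python
# Assumed assignment of calendar years to the two construction-time levels.
YEARS_BY_PERIOD = {
    "before": (1971, 1972, 1973, 1974),
    "after": (1975, 1976, 1977, 1978),
}


def construction_time(year):
    """Return the construction-time level that a given year is nested in."""
    for period, years in YEARS_BY_PERIOD.items():
        if year in years:
            return period
    raise ValueError(f"year {year} is outside the study period")


def year_index_within_period(year):
    """Return the index j (0..3) of a year within its own period."""
    return YEARS_BY_PERIOD[construction_time(year)].index(year)


if __name__ == "__main__":
    for year in range(1971, 1979):
        print(year, construction_time(year), year_index_within_period(year))
```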




The quantity we are using in the test for a heat effect is the interaction 
of outfall distance and construction time. To set up the F-ratio for this test, 
the table of expected mean squares is needed. This is given, along with the 
model equation for the eight-year design ("Design A"), in Table 1. The
expected mean squares were derived from the linear model using a procedure
given by Kirk (1968, p. 208).

In Table 1, notice that a number of interactions have been omitted. The
omissions result in a model that assumes additivity for the effects of season
and of depth zone. This choice was made for simplicity and because of the
difficulty of assigning an ecological interpretation to further interactions.
The nine terms that have been included in the linear model are those that we
expect to be non-zero based on our general understanding of the system. To
include all interactions would increase the number of terms in the linear
model from nine to 32 and would lead to unpleasant formulas for the expected
mean squares. Some remarks of Kirk may help explain our decision:

"The selection of an experimental design and associated linear 
model is largely based on an experimenter's subject-matter knowledge....
The model selected should include all sources of variation that the 
experimenter is interested in and that are expected to contribute 
significantly to the total variation. In reality, all sources of 
variation not specifically included in the model as treatment effects 
become a part of the experimental error." (Kirk 1968, p. 214.) 

The term that we use to test for the heat effect is $\theta^2_{\alpha\gamma}$.
The subscripts $\alpha$ and $\gamma$ refer to the construction time factor and
the outfall distance factor, respectively. The null hypothesis is that
$\theta^2_{\alpha\gamma} = 0$, i.e. that the startup of the plant has no effect
on the difference between the mean transformed benthos density at the inner
stations and the same quantity at the outer stations. Table 1 shows that the
expected mean square due to the CD interaction contains
$\theta^2_{\alpha\gamma}$:
$MS_{CD} = \sigma^2_\epsilon + 108\,\theta^2_{\alpha\gamma} + 27\,\sigma^2_{\beta\gamma}$.
The expected mean square



TABLE 1. Linear Model and Expected Mean Squares for Design A, a Nested
Analysis of Variance. The notation $\beta_{j(i)}$ indicates that the factor
indexed by j is nested within the factor indexed by i. This convention is used
by Kirk (1968, p. 230). Note that for depth zone and season only main effects
are included. Their interactions are neglected. To allow for this the error
degrees of freedom shown below have been increased by the appropriate amount.

          Greek                            No. of
Factor    Symbol        Name               Levels                   Type

C         $\alpha$      Construction time  2 (before, after)        fixed
Y         $\beta$       Year               4                        random (nested within C)
D         $\gamma$      Outfall distance   2 (inner, outer)         fixed
S         $\delta$      Season             3                        fixed
Z         $\zeta$       Depth zone         3                        fixed
E         $\epsilon$    Error              3 (no. of replicates)

No. of observations = 2 x 4 x 2 x 3 x 3 x 3 = 432
Total degrees of freedom = 431

Source of    Degrees of
Variation    Freedom      Expected Mean Square

C              1          $\sigma^2_\epsilon + 216\,\theta^2_\alpha + 54\,\sigma^2_\beta$
Y              6          $\sigma^2_\epsilon + 54\,\sigma^2_\beta$
D              1          $\sigma^2_\epsilon + 216\,\theta^2_\gamma + 27\,\sigma^2_{\beta\gamma}$
S              2          $\sigma^2_\epsilon + 144\,\theta^2_\delta$
Z              2          $\sigma^2_\epsilon + 144\,\theta^2_\zeta$
CD             1          $\sigma^2_\epsilon + 108\,\theta^2_{\alpha\gamma} + 27\,\sigma^2_{\beta\gamma}$
YD             6          $\sigma^2_\epsilon + 27\,\sigma^2_{\beta\gamma}$
Error        412          $\sigma^2_\epsilon$

Total        431

In what follows, c is the number of levels of factor C, y is the number of
levels of Y, and so on.

Since construction time is a fixed factor, $\theta^2_\alpha$ is the mean square
amplitude of the effects $\alpha_i$.

$$\theta^2_\alpha = \frac{\sum_{i=1}^{c} \alpha_i^2}{c - 1}$$

Since year is a random factor, $\sigma^2_\beta$ is the variance of the
population from which the effects $\beta_{j(i)}$ are drawn. Similarly,
$\sigma^2_{\beta\gamma}$ is the variance of the population from which the
interaction effects $(\beta\gamma)_{j(i)k}$ are drawn.

The other $\theta^2$ and $\sigma^2$ are defined in a parallel way. The symbol
$\sigma^2$ is used whenever one of the subscripts is $\beta$, indicating that
the (random) year factor is involved. Otherwise $\theta^2$ is used. This
notation follows that of Myers (1972, p. 188).



Similarly: 



for YD contains two of the above three terms:
$MS_{YD} = \sigma^2_\epsilon + 27\,\sigma^2_{\beta\gamma}$. The two mean
squares, $MS_{CD}$ and $MS_{YD}$, can thus be used in an F-ratio to test the
null hypothesis that $\theta^2_{\alpha\gamma} = 0$.






$$F = \frac{MS_{CD}}{MS_{YD}}$$

The quantity $\sigma^2_\epsilon$ measures the variability among replicates; the
quantity $\sigma^2_{\beta\gamma}$ measures the year-to-year variability of the
difference between the inner and outer populations, averaged over all months
and depth zones. Consider the null hypothesis that
$\theta^2_{\alpha\gamma} = 0$. The F-ratio then becomes
$(\sigma^2_\epsilon + 27\,\sigma^2_{\beta\gamma}) / (\sigma^2_\epsilon + 27\,\sigma^2_{\beta\gamma})$.
Under the null hypothesis the F-statistic will be distributed as tabular F(1,6).
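
The degrees of freedom behind F(1,6), and the coefficients 108 and 27 in the
two mean squares, follow from the numbers of levels listed in Table 1. The
sketch below recomputes them under the usual rules for a design with year
nested within construction time; the function and dictionary names are
hypothetical. The error degrees of freedom are obtained by subtraction, which
matches the statement that the omitted interactions are pooled into error.

```python
from math import prod

# Levels per factor as listed in Table 1 (Y counts years within each
# construction-time level).
LEVELS = {"C": 2, "Y": 4, "D": 2, "S": 3, "Z": 3}
REPLICATES = 3


def design_a_summary(levels=LEVELS, replicates=REPLICATES):
    """Return (total observations, degrees of freedom, EMS coefficients)."""
    c, y, d, s, z = (levels[k] for k in ("C", "Y", "D", "S", "Z"))
    n_total = prod(levels.values()) * replicates
    df = {
        "C": c - 1,
        "Y": c * (y - 1),            # years nested within C
        "D": d - 1,
        "S": s - 1,
        "Z": z - 1,
        "CD": (c - 1) * (d - 1),
        "YD": c * (y - 1) * (d - 1),  # year x distance, years nested in C
    }
    # Everything not named in the model is pooled into error.
    df["Error"] = (n_total - 1) - sum(df.values())
    coefficients = {
        "CD": n_total // (c * d),      # observations per C x D cell
        "YD": n_total // (c * y * d),  # observations per year x D cell
    }
    return n_total, df, coefficients


if __name__ == "__main__":
    n_total, df, coefficients = design_a_summary()
    print("observations:", n_total)
    print("degrees of freedom:", df)
    print("coefficients:", coefficients)
```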



Since $\sigma^2_\epsilon + 27\,\sigma^2_{\beta\gamma}$ is used as the
denominator, the quantity
$\sigma = \sqrt{\sigma^2_\epsilon + 27\,\sigma^2_{\beta\gamma}}$ is the
appropriate error standard deviation to use in the power equation. The next
step in calculating power is to get estimates for what $\sigma^2_\epsilon$ and
$\sigma^2_{\beta\gamma}$ are likely to be in the complete experiment, using the
portion of the data that is already available.
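
As a rough numerical illustration, the sketch below combines the two variance
estimates quoted in the abstract (.49527 and .01000) into the error standard
deviation and then into a least detectable change. Several things here are
assumptions rather than statements from the report: the standard error of the
interaction contrast is taken as sqrt(4 * sigma^2 / 108), treating it as a
contrast among four cell means of 108 observations each; the t values are
standard two-tailed table values for 6 degrees of freedom; and logarithms are
taken to be base 10. Under those assumptions the result agrees with the factor
of 5.48 stated in the abstract.

```python
import math

SIGMA2_ERROR = 0.49527          # sampling error variance (abstract)
SIGMA2_YEAR_DISTANCE = 0.01000  # year x outfall distance variance (abstract)
OBS_PER_YD_CELL = 27            # coefficient of the interaction variance
OBS_PER_CD_CELL = 108           # observations in each C x D cell
T_ALPHA = 2.447                 # tabled t, alpha = .05, 6 df (assumed)
T_POWER = 1.943                 # tabled t, 2(1 - P) = .10, 6 df (assumed)


def error_sd(sigma2_error, sigma2_interaction, k=OBS_PER_YD_CELL):
    """sigma = sqrt(sigma2_error + k * sigma2_interaction)."""
    return math.sqrt(sigma2_error + k * sigma2_interaction)


def detectable_log_change(sigma, cell_n=OBS_PER_CD_CELL):
    """Least detectable change in the inner-minus-outer difference (log10)."""
    standard_error = sigma * math.sqrt(4.0 / cell_n)  # assumed contrast SE
    return (T_ALPHA + T_POWER) * standard_error


sigma = error_sd(SIGMA2_ERROR, SIGMA2_YEAR_DISTANCE)
sigma_without_interaction = error_sd(SIGMA2_ERROR, 0.0)
delta = detectable_log_change(sigma)
ratio = 10.0 ** delta

if __name__ == "__main__":
    print(f"sigma with interaction variance    = {sigma:.4f}")
    print(f"sigma without interaction variance = {sigma_without_interaction:.4f}")
    print(f"least detectable log10 change      = {delta:.4f}")
    print(f"corresponding ratio                = {ratio:.2f}")
```

Setting the interaction variance to zero lowers sigma from about 0.87 to about
0.70, which shows how the non-zero estimate reduces the predicted power.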

Estimation of the Sampling Error Variance 

The quantity to be determined is $\sigma^2_\epsilon$, which we will also refer
to as the sampling error. Table 2 shows the surveys to be used in finding
variance estimates. Surveys from July 1970 through April 1972 were taken with a
sampling grid that is shown in Table 3. Beginning in July 1972, a survey plan
was adopted in which station locations were chosen randomly within each area of
interest. Details of this procedure were given by Mozley (1973). For
convenience, the calculations that follow will assume that the random survey
plan will be continued for the remainder of the eight-year experimental period
(1971-1978). Under this plan, four depth zones are sampled from each of three
regions. The D region is adjacent to the plant, the N region is 7 miles north,

TABLE 2. Dates of the Surveys Analyzed in this Report.

Year

1970               July      September    November
1971     April     July      September    November
1972     April     July*     October*†
1973     April*

*The starred dates had random surveys; the others had
grid surveys. Mozley (1973) gives a description of
the random survey; stations used in the grid survey are
displayed in Table 3.

†The survey taken in late October 1972 has been grouped
with the November surveys of 1970 and 1971 for purposes
of analysis of variance. There was no major survey in
September 1972.



TABLE 3. The DC, NDC and SDC Stations. [Station grid table; its contents are
not legible in this copy.]


and the S region is 7 miles south. For purposes of this report, the inner
stations are those in the D region and the outer stations are those in the S
region. Mozley's zone 0 (0-8 m) is here designated as A, zone 1 (8-16 m) as
B, and zone 2 (16-24 m) as C. Zone 3 (greater than 24 m) has not been included
in this analysis because it is too far from the outfall for an inner/outer
distinction to have much meaning.

It is worth explaining what is meant by "replication" in the random sur- 
vey. We consider the three randomly placed stations within a given zone to 
be replicates, and that each one estimates the zone mean. At each station of 
zones B and C, three casts were made using a three-chambered grab. The con- 
tents of chamber #1 of each grab were retained and counted. The animal densi- 
ties per square meter obtained in these separate casts were simply averaged 
together to get a single estimate of the density of animals at that station. 
Since three one-third grabs were taken at each station, the total area sam- 
pled was the same as what would have been obtained with a single cast of a 
whole grab. However, in considering any results from a whole grab, the pos- 
sibility of correlation of the catches from the three chambers should be kept 
in mind. 

The field procedure followed in zone A was different from that used in 
zones B and C. Three stations were taken in each zone of each region, but 
at each station five casts were made and five whole grabs were counted. To 
obtain a value for use in this report, one of the five grabs at each station 
was randomly selected. The advantage of this method is that the resulting 
effective sample in zone A is one whole grab, just as it is in zone B and 
zone C. Thus, data from the three zones can be combined to get a single es- 
timate of the error variance. 
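
The two field procedures can be summarized in a short sketch. The function
names and the example densities are hypothetical; only the rules themselves
(average three one-third grabs in zones B and C, keep one randomly chosen whole
grab out of five in zone A) come from the description above.

```python
import random
from statistics import mean


def station_density_zones_b_c(chamber_densities):
    """Zones B and C: average the chamber no. 1 densities of three casts."""
    if len(chamber_densities) != 3:
        raise ValueError("expected densities from three casts")
    return mean(chamber_densities)


def station_density_zone_a(grab_densities, rng):
    """Zone A: keep one randomly selected whole grab out of five."""
    if len(grab_densities) != 5:
        raise ValueError("expected densities from five whole grabs")
    return rng.choice(grab_densities)


if __name__ == "__main__":
    rng = random.Random(1974)  # fixed seed so the illustration is repeatable
    # Hypothetical densities in animals per square meter.
    print(station_density_zones_b_c([900, 1200, 1500]))
    print(station_density_zone_a([150, 90, 210, 60, 120], rng))
```

Either way the value carried forward represents the area of one whole grab,
which is why the three zones can share a single error variance estimate.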



This variance can be obtained using a three-way factorial ANOVA, outfall
distances x depth zones x months (2 x 3 x 3), with 3 replicates in each cell.
Before analysis, the density of animals at each station was transformed using
y = log (x+1). See Table 4 for the results. It shows the error variance as
0.49527. The results of the usual significance tests are shown in the table,
but they are not of interest for the present purpose, which is the estimation
of variances. The data used for this ANOVA are shown in Table 5 under the
heading "year 3."



TABLE 4. Estimation of the Sampling Error in the Random Survey. The method
used was a three-way factorial analysis of variance of three selected random
surveys. The data were total animals per square meter, transformed through
y = log (x+1).

Factors                       Levels

1. Outfall distance (2)       inner, outer
2. Depth zone (3)             A (0-8m), B (8-16m), C (16-24m)
3. Month (3)                  July/72, Oct./72, April/73

Within each cell, three stations were used as replicates.
For details see Table 5 under "year 3."

Source of                  Degrees of    Sums of      Mean
variation                  freedom       squares      squares     F-ratio

Outfall distance               1           .82140       .82140      1.66
Depths                         2         21.60070     10.80035     21.81**
Months                         2           .46030       .23015       .46
O.D. x depths                  2           .03333       .01667       .03
O.D. x months                  2          1.02862       .51431      1.04
Depths x months                4          2.46057       .61514      1.24
O.D. x depths x months         4           .18806       .04702       .09
Error                         36         17.82963       .49527

Total                         53         44.42261

** Significant, p<.01

F.05(1,36) = 4.11, F.05(2,36) = 3.26, F.05(4,36) = 2.63
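
The arithmetic of Table 4 can be cross-checked from its sums of squares and
degrees of freedom. The sketch below is only such a check, with hypothetical
names; it recomputes each mean square as SS/df and each F-ratio as the mean
square divided by the error mean square, and confirms that the error mean
square is the sampling error variance of 0.49527 quoted in the text.

```python
# (source, degrees of freedom, sum of squares) as given in Table 4.
TABLE_4 = [
    ("Outfall distance", 1, 0.82140),
    ("Depths", 2, 21.60070),
    ("Months", 2, 0.46030),
    ("O.D. x depths", 2, 0.03333),
    ("O.D. x months", 2, 1.02862),
    ("Depths x months", 4, 2.46057),
    ("O.D. x depths x months", 4, 0.18806),
    ("Error", 36, 17.82963),
]


def mean_squares(rows):
    """Mean square for each source: sum of squares / degrees of freedom."""
    return {name: ss / df for name, df, ss in rows}


def f_ratios(rows, error_name="Error"):
    """F-ratio for each source against the within-cell error mean square."""
    ms = mean_squares(rows)
    return {
        name: ms[name] / ms[error_name] for name in ms if name != error_name
    }


ms = mean_squares(TABLE_4)
f = f_ratios(TABLE_4)
total_df = sum(df for _, df, _ in TABLE_4)
total_ss = sum(ss for _, _, ss in TABLE_4)

if __name__ == "__main__":
    for name, df, ss in TABLE_4:
        f_text = f"{f[name]:6.2f}" if name in f else ""
        print(f"{name:24s}{df:4d}{ss:10.5f}{ms[name]:10.5f} {f_text}")
    print(f"{'Total':24s}{total_df:4d}{total_ss:10.5f}")
```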






TABLE 5. Data Used in the ANOVAs of Tables 4 and 8.

Columns headed by "x" give the actual densities in animals per square meter.
Columns headed by "y" give the corresponding transformed values (y = log(x+1)).
See Table 8 for an analysis of variance of these data. The input to that
ANOVA consisted of the means shown below, one for each month-zone combination.
Data for Year 1 and Year 2 were taken using the grid survey. The means for
those years are of the transformed densities from two selected stations.
Data for Year 3 were taken using the random survey. The means for that year
are means of the transformed densities from three stations. See the text for
an explanation of this procedure.

At each station of the grid survey, two casts were made. The contents of the 
two whole grabs were physically combined before counting. At each station of 
the random survey for zones B and C, three casts were made using a three- 
chambered grab. The contents of chamber no. 1 of each cast were retained 
and counted. The density shown below for each station of zones B and C 
is the mean of the untransformed densities of the three casts. At each 
station of zone A of the random survey, five casts were made and five whole 
grabs were counted. The density shown below for each station of zone A was 
obtained by randomly selecting one of the five grabs. 
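
Because the cell means are means of transformed densities, they are not the
same as the transform of the mean density. The sketch below illustrates this
with the two zone A inner stations for July/70 (x = 139 and x = 226, printed
mean 2.2511). Base-10 logarithms are assumed, and the function names are
hypothetical.

```python
import math


def transform(x):
    """y = log10(x + 1)."""
    return math.log10(x + 1.0)


def mean_transformed(densities):
    """Mean of the transformed densities, as used for the cell means."""
    values = [transform(x) for x in densities]
    return sum(values) / len(values)


densities = [139, 226]  # zone A, inner, July/70
mean_y = mean_transformed(densities)
back_transformed = 10.0 ** mean_y - 1.0        # geometric-type mean
arithmetic_mean = sum(densities) / len(densities)

if __name__ == "__main__":
    print(f"mean of transformed densities = {mean_y:.4f}")
    print(f"back-transformed mean         = {back_transformed:.1f}")
    print(f"arithmetic mean of densities  = {arithmetic_mean:.1f}")
```

The back-transformed value is lower than the arithmetic mean, so averaging on
the log scale gives less weight to occasional very large counts.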



Inner - Year 1 (grid survey)

                       July/70             Nov./70              April/71
Zone  Station             x        y          x        y          x        y

A     NDC-.5-1          139   2.1461         50   1.7076        198   2.2989
A     SDC-.5-1          226   2.3560        893   2.9513        144   2.1614
      Mean                    2.2511              2.3295              2.2302
B     NDC-1-2         3,006   3.4781      1,363   3.1348      4,059   3.6085
B     SDC-1-2         4,980   3.6973      2,483   3.3952         36   1.5682
      Mean                    3.5877              3.2650              2.5884
C     NDC-1-3        10,084   4.0037      4,241   3.6276      1,993   3.2997
C     SDC-1-3         5,755   3.7601     17,326   4.2387        795   2.9009
      Mean                    3.8819              3.9332              3.1003




Outer - Year 1 (grid survey)

                       July/70             Nov./70              April/71
Zone  Station             x        y          x        y          x        y

A     SDC-4-1           103   2.0170         16   1.2304         18   1.2788
A     SDC-7-1           120   2.0828         69   1.8451         18   1.2788
      Mean                    2.0499              1.5377              1.2788
B     SDC-4-2         1,207   3.0821      1,971   3.2949        686   2.8370
B     SDC-7-2         1,519   3.1818      1,268   3.1035        869   2.9395
      Mean                    3.1319              3.1992              2.8883
C     SDC-4-3         4,466   3.6500      5,267   3.7216      5,835   3.7661
C     SDC-7-5         6,587   3.8188      9,432   3.9746      7,848   3.8948
      Mean                    3.7344              3.8481              3.8305



Inner - Year 2 (grid survey)

                       July/71             Nov./71              April/72
Zone  Station             x        y          x        y          x        y

A     NDC-.5-1          815   2.9117         72   1.8633         18   1.2788
A     SDC-.5-1          253   2.4048        162   2.2122      1,177   3.0711
      Mean                    2.6583              2.0377              2.1750
B     NDC-1-2         1,793   3.2358      2,862   3.4568        325   2.5132
B     SDC-1-2         9,244   3.9659        741   2.8704      3,424   3.5347
      Mean                    3.6009              3.1636              3.0239
C     NDC-1-3         2,808   3.4486      7,376   3.8679        959   2.9823
C     SDC-1-3         1,793   3.2538     18,708   4.2721      5,128   3.7100
      Mean                    3.3422              4.0700              3.3462




Outer - Year 2 (grid survey)

                       July/71             Nov./71              April/72
Zone  Station             x        y          x        y          x        y

A     SDC-4-1        416[a]   2.6201        144   2.1614        344   2.5378
A     SDC-7-1           543   2.7356       0[b]   0.0000        326   2.5145
      Mean                    2.6778              1.0807              2.5262
B     SDC-4-2         1,448   3.1611        397   2.5999        777   2.8910
B     SDC-7-2           869   2.9395     543[c]   2.7356        578   2.7627
      Mean                    3.0503              2.6677              2.8269
C     SDC-4-3        12,906   4.1108      8,554   3.9322      6,487   3.8121
C     SDC-7-5        10,241   4.0104     506[d]   2.7050     10,875   4.0365
      Mean                    4.0606              3.3186              3.9243

[a] July/71: no data for SDC-4-1. The count for NDC-4-1 was used instead.
[b] Nov./71; SDC-7-1: no data. Used NDC-4-1.
[c] Nov./71; SDC-7-2: no data. Used NDC-4-2.
[d] Nov./71; SDC-7-5: no data. Used NDC-4-3.



Inner - Year 3 (random survey) [e]

                         July/72             Oct./72              April/73
Zone  Station               x        y          x        y          x        y

A     D01,04,07[f]      6,038   3.7810      2,285   3.3591          0   0.0000
A     D02,05,08           734   2.8663      4,162   3.6194      7,344   3.8660
A     D03,06,09           143   2.1584         20   1.3222         20   1.3222
      Mean                      2.9352              2.7669              1.7294
B     D11,14,17         1,293   3.1119        930   2.9689      2,828   3.4516
B     D12,15,18        14,200   4.1523        969   2.9868      3,050   3.4844
B     D13,16,19         4,566   3.6596      6,322   3.8009      3,171   3.5013
      Mean                      3.6413              3.2522              3.4791
C     D21,24,27        10,544   4.0230      7,797   3.8920     20,382   4.3093
C     D22,25,28        19,028   4.2794     18,281   4.2620      3,515   3.5460
C     D23,26,29         1,838   3.2646     17,008   4.2307      4,667   3.6691
      Mean                      3.8557              4.1282              3.8415

[e] See footnote g, following the Outer - Year 3 data.

[f] This notation means that D01 was used in July/72, D04 was used in Oct./72,
and D07 was used in April/73. For the meaning of the station names,
see Mozley (1973).




Outer - Year 3 (random survey) [g]

                         July/72             Oct./72              April/73
Zone  Station               x        y          x        y          x        y

A     S01,04,07           571   2.7574        122   2.0899        204   2.3118
A     S02,05,08         1,306   3.1163      1,306   3.1163      2,652   3.4237
A     S03,06,09           836   2.9227        224   2.3522         61   1.7924
      Mean                      2.9321              2.5195              2.5093
B     S11,14,17         2,728   3.4360      1,658   3.2198     61,570   4.7894
B     S12,15,18         5,293   3.7238      2,485   3.3955      7,576   3.8795
B     S13,16,19        16,988   4.2302      3,030   3.4816      3,312   3.5202
      Mean                      3.7967              3.3656              4.0630
C     S21,24,27         5,494   3.7400      8,848   3.9469     14,665   4.1663
C     S22,25,28        13,393   4.1269     24,159   4.3831     32,704   4.5146
C     S23,26,29        27,391   4.4376     19,028   4.2794     24,826   4.3949
      Mean                      4.1015              4.2031              4.3586

[g] The data of Year 3 were subjected to analysis of variance to estimate the
sampling error of the random survey (Table 4). The input to that ANOVA
consisted of the transformed densities at the individual stations,
rather than the zone means of the transformed densities, as in the ANOVA
of Table 8.



It is worth considering whether the data obtained using the grid survey 
(July/70 through April/72) show an error variance that is similar to the 
value just obtained from the random survey results. An answer can be obtained
by dividing the grid into depth zones and choosing inner and outer groups
of stations within each depth range. Such a grouping was described in Part 
18. Table 6 presents a slightly different grouping, using 30 of the 46 
stations. This arrangement puts an equal number of stations in each group 
and avoids those stations for which there was a lot of missing data in the 
months being studied. The five inner stations needed for each zone were 






TABLE 6. Data for the Estimate of Sampling Error in the Grid Survey.

Columns headed by "x" contain the actual densities in animals per square
meter. Columns headed by "y" contain the corresponding transformed values.
The transformation is y = log(x+1). "ND" means "no data". See Table 7 for
the analysis of variance of these data. The five replicates within each
cell are the densities at five individual stations, selected from those
stations in the grid that lie within the given depth range and have the
appropriate outfall distance (inner or outer). The inner stations are those
that lie on one of the seven innermost transects (SDC-1, SDC-.5, DC, NDC-.5,
NDC-1); the others are classified as outer. The sample at each station
was the combined contents of two grabs. The data given here previously
appeared in Table 10 of Ayers et al. (1971) or in Tables 38-44 of Mozley
(1973).



Inner - 1970

                        July                Sept.                Nov.
Zone  Station             x        y          x        y          x        y

A     DC-1              962   2.9836         ND       ND         ND       ND
A     NDC-.5-1          139   2.1461        372   2.5717         50   1.7076
A     SDC-.5-1          226   2.3560      1,727   3.2375        893   2.9513
A     NDC-1-1           521   2.7168        502   2.7016        416   2.6201
A     SDC-1-1           278   2.4440         77   1.8921      2,268   3.3558
B     DC-2            2,519   3.4014      2,493   3.3969      3,640   3.5612
B     NDC-.5-2           ND       ND      2,840   3.4535      2,371   3.3751
B     SDC-.5-2        1,023   3.0102      1,260   3.1007      1,850   3.2674
B     NDC-1-2         3,006   3.4781     27,681   4.4422      1,363   3.1348
B     SDC-1-2         4,980   3.6973      3,025   3.4809      2,483   3.3952
C     DC-3            7,319   3.8645      5,675   3.7540      2,465   3.3920
C     NDC-.5-3        5,875   3.7691      2,450   3.3893      6,761   3.8301
C     SDC-.5-3        5,154   3.7122      4,544   3.6575      3,042   3.4833
C     NDC-1-3        10,084   4.0037     17,465   4.2422      4,241   3.6276
C     SDC-1-3         5,755   3.7601     17,379   4.2400     17,326   4.2387




Inner - 1971 







Ju^y 


Sept. 


Nov. 


Zone 


Station 
DC-1 


X 


V 


X 


V 


X 


y . 


A 


ND 


ND 


ND 


ND 


126 


2.1038 


A 


NDC-.5-1 


815 


2.9117 


1,159 


3.0645 


72 


1.8633 


A 


SDC-.5-1 


253 


2.4048 


1,703 


3.2315 


162 


2.2122 


A 


NDC-1-1 


1,232 


3.0910 


634 


2.8028 





0.0000 


A 


SDC-1-1 


869 


2.9395 


815 


2.9117 


162 


2.2122 


B 


DC-2 


3,623 


3.5592 


4,130 


3.6161 


1,921 


3.2838 


B 


NDC-.5-2 


2,736 


3.4373 


3,334 


3.5231 


597 


2.7767 


B 


SDC- . 5-2 


1,829 


3.2625 


2,971 


3.4730 


2,752 


3.4398 


B 


NDC-1-2 


1,793 


3.2538 


1,593 


3.2025 


2,862 


3.4568 


B 


SDC-1-2 


9,244 


3.9659 


7,521 


3.8763 


741 


2.8704 


C 


DC-3 


1,032 


3.0141 


ND 


ND 


2,138 


3.3302 


C 


NDC-.5-3 


7,285 


3.8625 


6,362 


3.8037 


19,395 


4.2877 


C 


SDC-.5-3 


2,573 


3.4106 


8,410 


3.9248 


1,593 


3.2025 


C 


NDC-1-3 


2,808 


3.4486 


10,985 


4.0408 


7,376 


3.8679 


C 


SDC-1-3 


1,793 


3.2538 

Outer 


12,652 
- 1970 


4.1022 


18,708 


4.2721 


A 


NDC-2-1 





0.0000 


696 


2.8432 


877 


2.9435 


A 


SDC-2-1 


677 


2.8312 


138 


2.1430 


138 


2.1430 


A 


NDC-2-2 


ND 


ND 


1,0^'^ 


3.0175 


3,669 


3.5647 


A 


NDC-4-1 


295 


2.4713 


60 


1.7853 


68 


1.8388 


A 


SDC-4-1 


103 


2.0170 


354 


2.5502 


16 


1.2304 


B 


SDC-2-2 


1,268 


3.1035 


1,780 


3.2507 


2,502 


3.3985 


B 


m)C-2-3 


1,319 


3.1206 


1,385 


3.1418 


7,762 


3.8900 


B 


SDC-2-3 


10,509 


4.0216 


8,953 


3.9520 


16,961 


4.2295 


B 


NDC-4-2 


1,172 


3.0693 


172 


2.2380 


686 


2.8370 


B 


SDC-4-2 


1,207 


3.0821 


1,475 


3.1691 


1,971 


3.2949 


C 


NDC-2-4 


16,789 


4.2251 


6,363 


3.8037 


18,248 


4.2612 


C 


SDC-2-4 


9,118 


3.9599 


12,657 


4.1024 


519 


2.7160 


C 


NDC-4-3 


2,128 


3.3282 


1,041 


3.0179 


4,033 


3.6057 


C 


SDC-4-3 


4,466 


3.6500 


15,335 


4.1857 


5,267 


3.7216 


C 


SDC-7-4 


6,554 


3.8166 


26,716 


4.4268 


2,945 


3.4692 



(cont . ) 






TABLE 6 (cont.)

Outer - 1971

                      July              Sept.             Nov.
Zone  Station          x        y        x        y        x        y
A     NDC-2-1        561   2.7497     290   2.4639      90   1.9590
A     SDC-2-1        416   2.6201     217   2.3385     543   2.7356
A     NDC-2-2      1,249   3.0969     706   2.8494     778   2.8915
A     NDC-4-1        416   2.6201     289   2.4624       0   0.0000
A     SDC-4-1         ND       ND     181   2.2601     144   2.1614
B     SDC-2-2      2,445   3.3884   2,825   3.4512   1,085   3.0358
B     NDC-2-3      5,311   3.7253   1,377   3.1392     740   2.8698
B     SDC-2-3      1,576   3.1978   5,582   3.7469   3,822   3.5824
B     NDC-4-2        144   2.1614     343   2.5366     543   2.7356
B     SDC-4-2      1,448   3.1611   2,717   3.4342     397   2.5999
C     NDC-2-4     29,912   4.4759  28,154   4.4496  11,837   4.0733
C     SDC-2-4     15,155   4.1806  15,862   4.2004  15,099   4.1790
C     NDC-4-3      1,684   3.2266   4,622   3.6649     506   2.7050
C     SDC-4-3     12,906   4.1108  10,458   4.0195   8,554   3.9322
C     SDC-7-4     14,611   4.1647   4,441   3.6476      ND       ND



chosen from the available stations that lay on the five innermost tran-
sects (SDC-1, SDC-.5, DC, NDC-.5, NDC-1). The five outer stations of
each zone were chosen from the remaining six transects. The stations in
each group provide five replicate values for each cell of a four-way
analysis of variance. The sample taken at each station was the combined
contents of two grabs. (It would have been preferable if the two grabs
had been counted separately.) To make the error variance of these data
comparable to that obtained from the random survey, it is necessary to
multiply it by two. (The variance of the mean of two observations drawn
at random from some distribution is half the variance of the distribution
itself.)






The results of the four-way ANOVA are shown in Table 7. Note that the 
error variance is 0.28464. When this is multiplied by two it becomes 0.56928, 
which is quite comparable to the value obtained from the random survey, 
0.49527. However, the latter value is the one we will use in later calcula- 
tions; it will be referred to as the sampling error. 
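
The rescaling described above can be checked with a few lines of Python. This is only a sketch: it reuses the two variances quoted in the text, and the helper name `per_grab_variance` is ours, introduced for illustration. It assumes the pooled grabs behave like independent draws from the same distribution.

```python
def per_grab_variance(variance_of_mean, grabs_pooled):
    """Convert the variance of a mean of pooled grabs to a single-grab basis.

    Assumes independent grabs, so that
    Var(mean of k grabs) = Var(single grab) / k.
    """
    return variance_of_mean * grabs_pooled


grid_error_ms = 0.28464    # error mean square, grid survey (Table 7)
random_error_ms = 0.49527  # error mean square, random survey

grid_per_grab = per_grab_variance(grid_error_ms, grabs_pooled=2)

print(f"grid survey, per-grab basis: {grid_per_grab:.5f}")
print(f"ratio to random survey:      {grid_per_grab / random_error_ms:.2f}")
```

The two estimates differ by a modest factor, which is the sense in which the text calls them comparable.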

Estimation of the Interaction Variance $\sigma^2_{\beta\gamma}$



To obtain a value for $\sigma^2_{\beta\gamma}$, data from a number of years must be
considered. Table 2 shows the surveys that are available for this purpose.
The more years that can be used, the better the estimate. It is also desirable
to include as many seasons as possible from each year. The arrangement we
have chosen is as follows: year 1: July/70, Nov./70, April/71; year 2: July/71,
Nov./71, April/72; year 3: July/72, Oct./72, April/73. All data in years 1
and 2 were taken with the grid survey; the data in year 3 were taken with
the random survey.

The fact that both grid and random data were to be considered in one 
ANOVA caused some difficulties. First, the areas sampled by the "outer" 
station groups are not the same (though they are remote from the outfall in 
both cases); second, the individual data values are not exactly comparable. 
The latter is because the datum at each station of the grid survey was the 
mean of two grabs. The grabs were physically combined before counting, so 
the individual grab counts are unavailable. Moreover, the random survey 
data consist of the results of three one-third grab casts at each station.
The mean of these three may or may not be comparable to what would have been 
obtained with one cast of a whole grab, since the results from the three 
chambers might be correlated. 

This situation was handled with a compromise method. We computed the 






TABLE 7. Estimation of the Sampling Error in the Grid Survey.

The method used was a four-way factorial analysis of variance of six selected
grid surveys. The data were total animals per square meter, transformed
through y = log(x+1).

Factors:

1. Outfall distance (2):  inner, outer
2. Year (2):              1970, 1971
3. Depth zone (3):        A (0-8 m), B (8-16 m), C (16-24 m)
4. Month (3):             July, September, November

Within each cell, five individual stations were used as replicates.
The sample taken at each station was the combined contents of two grabs.

Source of               Degrees of    Sums of        Mean
variation               freedom       squares        squares      F-ratio

1. Outfall distances        1            .64175        .64175       2.25
2. Years                    1            .00446        .00446        .02
3. Depth zones              2          54.44033      27.22017      95.63**
4. Months                   2           2.24850       1.12425       3.95*
12                          1            .14829        .14829        .52
13                          2            .85552        .42776       1.50
14                          2            .19843        .09922        .35
23                          2            .25941        .12970        .46
24                          2           1.95730        .97865       3.44*
34                          4            .74026        .18507        .65
123                         2            .41964        .20982        .74
124                         2            .33970        .16985        .60
134                         4            .86831        .21708        .76
234                         4           2.86123        .71531       2.51*
1234                        4            .94894        .23723        .83
Error                     135#         38.42679        .28464

Total                     170         105.35886

*  Significant, p<.05
** Significant, p<.01

F.05(1,125) = 3.92, F.05(2,125) = 3.07, F.05(4,125) = 2.44

# The numerator sums of squares and the error degrees of freedom have been
adjusted for the nine missing data values by the method of Winer (1971,
p. 402).
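
As a cross-check on Table 7, the F-ratios can be recomputed from the mean squares. The sketch below is illustrative only: it copies a subset of rows from the table, takes the 5% critical values from the table footnote, and uses descriptive row labels of our own (for example, "24" in the table is the year x month interaction).

```python
# Mean squares and numerator degrees of freedom copied from Table 7.
error_ms = 0.28464
rows = {
    "Depth zones": (27.22017, 2),
    "Months": (1.12425, 2),
    "Year x month (24)": (0.97865, 2),
    "Year x zone x month (234)": (0.71531, 4),
    "Years": (0.00446, 1),
}

# 5% critical values quoted beneath Table 7, keyed by numerator d.f.
f_crit_05 = {1: 3.92, 2: 3.07, 4: 2.44}

# Every effect is tested against the error mean square in this fixed design.
f_ratios = {name: ms / error_ms for name, (ms, df) in rows.items()}

for name, (ms, df) in rows.items():
    verdict = "significant" if f_ratios[name] > f_crit_05[df] else "n.s."
    print(f"{name:28s} F = {f_ratios[name]:6.2f}  ({verdict} at .05)")
```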






zone means for each type of survey, and ran an ANOVA on the means rather than 
on the individual station values. This method is unable to give a value for 
the error variance, but we already have an estimate of that from the preced- 
ing section. The zone means for the grid survey were obtained using two 
stations, so that each datum is the mean of four grabs. The zone means for 
the random survey were obtained using three stations, so that each datum is 
either the mean of three whole grabs (zone 0) or the mean of nine one-third 
grabs (zones 1 and 2) . Ideally we might have performed some kind of weighted- 
means analysis, since the area sampled is not exactly the same in all cells, 
but it was decided to do a regular ANOVA. The layout and results of this 
appear in Table 8. The data that were analyzed are in Table 5. 

As before, Table 8 includes the results of the usual significance tests,
but they are not of prime interest. The quantity we are concerned about is
the mean square for the interaction of year and outfall distance. MS =
.255085 for that interaction. To interpret the value, it is necessary to
write down a model equation for the appropriate ANOVA and derive some
expected mean squares.

At this point a complication occurs, since we want to be able to compare
$\hat{\sigma}^2_{\beta\gamma}$ with the previously obtained sampling error. (The notation
$\hat{\sigma}^2_{\beta\gamma}$ means "an estimate of $\sigma^2_{\beta\gamma}$.") It is necessary to write
down an ANOVA design that
includes three-fold replication even though the analysis given in Table 8 
does not. The mean squares in Table 8 can be corrected to the values they 
would have assumed with three-fold replication by multiplying each of them 
by three. This can be justified by noting that all sum of squares formulas 
for a factorial ANOVA, except the total and the error sums of squares, 
involve summing first over replicates before doing any squaring or any 
other summations, and that such formulas also require dividing through by 






TABLE 8. Estimation of the Interaction Variance $\sigma^2_{\beta\gamma}$. The method used
was a four-way mixed model factorial analysis of variance of nine surveys,
six grid and three random. (See Table 5 for the data analyzed.) The
quantity analyzed was total animals per square meter, transformed through
y = log(x+1). The entry in each cell was the zone mean of the transformed
variable, so there was in effect just one replicate per cell. Note the
column in the table that indicates which mean square was used as the de-
nominator of F. Table 9 provides the list of expected mean squares.

Factor   Name                No. of Levels   Type

Y        Year                      3         random
D        Outfall distance          2         fixed
Z        Depth zone                3         fixed
S        Season                    3         fixed

Source of                            Sums of      Mean       Denom.
Variation                     df     Squares      Squares    of F     F-ratio

Year                           2     2.61669      1.30835    YDZS      18.13**
Outfall distance               1      .00996       .00996    YD          .04
Depth zone                     2    23.41074     11.70537    YZ       215.53**
Season                         2      .98412       .49206    YS         3.08
Year x outfall distance        2      .51017       .25508    YDZS       3.53
Year x zone                    4      .21724       .05431    YDZS        .75
Year x season                  4      .64002       .16000    YDZS       2.22
Outfall distance x zone        2      .42986       .21493    YDZ        2.27
Outfall distance x season      2      .96634       .48317    YDS        6.61
Zone x season                  4      .62625       .15656    YZS        1.20
Y x D x Z                      4      .37900       .09475    YDZS       1.31
Y x D x S                      4      .29222       .07305    YDZS       1.01
Y x Z x S                      8     1.04335       .13042    YDZS       1.81
D x Z x S                      4      .26505       .06626    YDZS        .92
Y x D x Z x S                  8      .57740       .07217     -

Total                         53    32.96841

F.05(1,2) = 18.51     F.05(2,8) = 4.46     F.01(2,4) = 18.00
F.05(2,4) =  6.94     F.05(4,8) = 3.84     F.01(2,8) =  8.65
                      F.05(8,8) = 3.44






the number of replicates. Thus, if two ANOVAs have identical cell means, and
one has 1 replicate while the other has n, then the mean squares of the second
are larger by a factor $n^2/n$, or n. The exception is the error MS, which the
first ANOVA lacks because it is unreplicated.
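
The scaling argument above can be illustrated numerically. The following Python sketch uses made-up cell means in a small two-way layout (not data from this report) and a helper of our own, `row_mean_square`, which follows the usual computing formula: totals are summed over replicates first, squared, and divided by the number of observations per total.

```python
def row_mean_square(cells):
    """Mean square for the row factor of a balanced two-way layout.

    cells[i][j] is the list of replicate values in row i, column j.
    """
    n_rows, n_cols = len(cells), len(cells[0])
    n_rep = len(cells[0][0])
    row_totals = [sum(sum(cell) for cell in row) for row in cells]
    grand_total = sum(row_totals)
    per_row = n_cols * n_rep  # observations contributing to each row total
    ss = (sum(t * t for t in row_totals) / per_row
          - grand_total ** 2 / (n_rows * per_row))
    return ss / (n_rows - 1)


# Hypothetical cell means, chosen only for illustration.
cell_means = [[3.1, 3.4], [2.6, 3.9], [3.0, 3.2]]

one_rep = [[[m] for m in row] for row in cell_means]
three_rep = [[[m] * 3 for m in row] for row in cell_means]

ratio = row_mean_square(three_rep) / row_mean_square(one_rep)
print(f"MS(3 replicates) / MS(1 replicate) = {ratio:.6f}")
```

With identical cell means, the replicated mean square is three times the unreplicated one, which is the factor applied to the Table 8 mean squares.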

A suitable three-fold replicated analog to the Table 8 ANOVA is shown 
in Table 9, and is designated as Design B. This is a mixed model, since the 
year factor is random and the other three are fixed. We make year a random 
factor because we are not interested in the three individual year means 
but in the population of years from which those three are a sample. We use 
a model with three replicates for convenience, although the means analyzed 
in Table 8 are sometimes based on the area of three grabs and sometimes on 
the area of four grabs, as discussed earlier. 

Table 9 contains a line from which $\sigma^2_{\beta\gamma}$ can be estimated. It is the
Y x D interaction (year x outfall distance), whose expected mean square is
$\sigma^2_\epsilon + 27\sigma^2_{\beta\gamma}$. The MS for year x outfall distance in Table 8 is .255085.
This should be multiplied by three as discussed above; the result is .76526.
The sampling error, $\sigma^2_\epsilon$, was estimated in the last section as .49527.
Hence, an estimate of $\sigma^2_{\beta\gamma}$ is (.76526 - .49527)/27 = .01000.
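
The arithmetic of this estimate is short enough to sketch in Python. All numbers come from the text; the variable names are ours, and the coefficient 27 is taken to be the product of the number of replicates, seasons and depth zones in Design B.

```python
ms_yd_unreplicated = 0.255085  # year x outfall distance MS from Table 8
replicates = 3                 # replication assumed in Design B
seasons, zones = 3, 3          # levels of the two remaining fixed factors
sampling_error = 0.49527       # sampling error variance, random survey

# Coefficient of the interaction variance in the expected mean square.
coefficient = replicates * seasons * zones

# Rescale the unreplicated mean square to a three-replicate basis.
ms_yd = replicates * ms_yd_unreplicated

# Solve MS = sampling_error + coefficient * interaction_variance.
interaction_variance = (ms_yd - sampling_error) / coefficient

print(f"coefficient          = {coefficient}")
print(f"rescaled MS          = {ms_yd:.5f}")
print(f"interaction variance = {interaction_variance:.5f}")
```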

Although it is permissible to estimate $\sigma^2_{\beta\gamma}$ in this way, it should be
noted that zero is another reasonable estimate. Let us test the null hypo-
thesis that $\sigma^2_{\beta\gamma} = 0$. If this is true, then the two quantities found above,
.76526 and .49527, would be estimates of the same population variance $\sigma^2_\epsilon$.
The first of these comes from a mean square with 2 d.f., the second from a
mean square with 36 d.f. $F_s$ = .76526/.49527 = 1.55. From a table,
$F_{.05}(2,36)$ = 3.26. Thus, the hypothesis that $\sigma^2_{\beta\gamma} = 0$ cannot be rejected
using the available data. On the other hand, the hypothesis that
$\sigma^2_{\beta\gamma}$ = .01000 cannot be rejected either. If this were true, then we would have:






TABLE 9. Linear Model and Expected Mean Squares for Design B, a Factorial
Analysis of Variance.

Factor   Name                No. of Levels   Type

Y        Year                      3         random
D        Outfall distance          2         fixed
S        Season                    3         fixed
Z        Depth zone                3         fixed

This is a mixed model ANOVA, since factor Y is random while the others
are fixed.

Source of      Degrees of
variation      freedom        Expected mean square

Y                  2
D                  1
S                  2
Z                  2
YD                 2          $\sigma^2_\epsilon + 27\sigma^2_{\beta\gamma}$
YS                 4
YZ                 4
DS                 2
DZ                 2
SZ                 4
YDS                4
YDZ                4
YSZ                8
DSZ                4
YDSZ               8
Error           see text



$MS_{YD} = .76526 = \sigma^2_\epsilon + 27(.01000) = \sigma^2_\epsilon + .27000$.
This can be solved to give an estimate of .49526 for $\sigma^2_\epsilon$. $F_s$ = .49526/.49527 =
1.00, which is not significant. We will use the estimate $\hat{\sigma}^2_{\beta\gamma}$ = .01000.
The present estimate of $\sigma^2_{\beta\gamma}$ is based on only 2 degrees of freedom (the
number of years of data minus one). When more years of data are available,
its value can be assigned with more confidence. To be prepared for the
possibility that it is non-zero is the more conservative course, and it is
the one that we favor.

Justification for a non-zero estimate of an added variance component in 
a case where it is not significant is given by Sokal and Rohlf (1969, p. 
265). 
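
The two hypothesis checks discussed above can be reproduced with a short Python sketch. The critical value 3.26 is the tabled value quoted in the text; the variable names are ours and nothing here goes beyond the arithmetic already given.

```python
ms_yd = 0.76526           # rescaled year x outfall distance mean square
sampling_error = 0.49527  # sampling error variance, 36 d.f.
f_crit_05 = 3.26          # F.05(2,36), as quoted in the text
coefficient = 27          # multiplier of the interaction variance

# Test of the hypothesis that the interaction variance is zero.
f_zero = ms_yd / sampling_error
reject_zero = f_zero > f_crit_05

# Under the alternative hypothesis (interaction variance = .01000), the
# sampling error implied by the mean square is compared with the estimate.
implied_error = ms_yd - coefficient * 0.01000
f_alternative = implied_error / sampling_error

print(f"F for zero interaction variance: {f_zero:.2f}, reject: {reject_zero}")
print(f"implied sampling error:          {implied_error:.5f}")
print(f"F for interaction variance .01:  {f_alternative:.2f}")
```

Neither hypothesis is rejected, which is why the text falls back on the more conservative non-zero estimate.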

Calculation of the Least Detectable True Change 

It is now possible to calculate the smallest true change detectable using 
the eight-year design (Design A), with a 5% significance level and 95% 






power. The error standard deviation needed was previously found to be
$\sigma = \sqrt{\sigma^2_\epsilon + 27\sigma^2_{\beta\gamma}}$. Estimates of the two quantities under the radi-
cal sign are now available: $\hat{\sigma}^2_\epsilon$ = .49527, based on 36 degrees of freedom,
and $\hat{\sigma}^2_{\beta\gamma}$ = .01000, based on 2 degrees of freedom. These lead to
$\hat{\sigma} = \sqrt{.76526} = .87479$.

A further step is to compute a $\delta$ value from this $\hat{\sigma}$. A quantity $\delta$
was presented earlier as a least detectable true difference where two
treatment means in an ANOVA were to be compared. In the present case we are
computing the power of an F-test of an interaction, and some algebra is
required to show how the $\delta$ given by Sokal and Rohlf's equation can be
applied here.

The first step is to present some equations of Cohen (1969). Using
his own measure for the root mean square amplitude of a true interaction
effect, $\sigma_x$, he gives the following equation (p. 365):



<^x- 






This applies to a two-way interaction in a two-factor analysis of variance,
having r rows (levels of the first factor) and c columns (levels of the
second factor). $x_{ij}$ is the interaction effect in cell ij, and it is given
by: $x_{ij} = m_{ij} - m_{i.} - m_{.j} + m$ ($m_{ij}$ is the population mean [true mean] of
cell ij; $m_{i.}$ is that of row i; $m_{.j}$ is that of column j; m is the grand
population mean). In his notation $f = \sigma_x/\sigma$, where f = d/2 for a compari-
son involving a single degree of freedom (p. 269). His d is identical with
our $\delta/\sigma$, so f is just $\delta/2\sigma$. The formula for $\sigma_x$ given above can be re-
written in our own notation as:




^==^" *fM 






where df are the degrees of freedom for the interaction. 

The next question is: how can the above formula for a two-factor ANOVA
be applied in the present case (i.e. in Design A), which uses a five-factor
ANOVA? The answer is that, for calculating $x_{ij}$ and the number of observations
per cell (n), the other three factors should be "collapsed" so that a two-
factor ANOVA results. Design A is a 2 x 4 x 2 x 3 x 3 problem, with 3
replicates, so that there are 144 cells and a total of 432 observations. By
collapsing the month, zone and year factors we are left with a two-factor
problem with four cells: construction time (2) x outfall distance (2). The
mean of each cell, when the complete data are available in 1979, will be
the mean of the transformed animal densities over all months, zones and
years for stations with the appropriate outfall distance and construction
time. Since 432 observations will be divided equally into four groups,
there will be a total of 108 observations per cell. According to Cohen,
"... the n which governs the power of an R x C interaction test is the cell
n ..." (p. 367). Hence we set n = 108 in the Sokal and Rohlf formula for
$\delta$.
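
The cell counts quoted above follow directly from the factor levels, as the Python sketch below shows. The assignment of level counts to factor names is our reading of the "2 x 4 x 2 x 3 x 3" layout (the text does not list the factors in order), so the dictionary keys should be treated as assumptions; only the products matter for the result.

```python
from math import prod

# Assumed mapping of Design A level counts to factors.
levels = {
    "construction time": 2,
    "year within construction time": 4,
    "outfall distance": 2,
    "month": 3,
    "zone": 3,
}
replicates = 3

cells = prod(levels.values())
observations = cells * replicates

# Collapse every factor except construction time and outfall distance.
collapsed_cells = levels["construction time"] * levels["outfall distance"]
n_per_collapsed_cell = observations // collapsed_cells

print(f"cells in full design:        {cells}")
print(f"total observations:          {observations}")
print(f"observations per 2 x 2 cell: {n_per_collapsed_cell}")
```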

Derivation of a Least Detectable True Ratio in Terms of $\delta$

What is the relationship of the four $x_{ij}$ values in the collapsed 2 x
2 ANOVA? Since interaction effects must sum to zero in each row and column,
there is only a single degree of freedom in this case. Each interaction
must equal either x or -x, where x is the common absolute value of the four
interactions. Thus $\sum^{rc} x_{ij}^2 = 4x^2$. Putting df = 1 in the Cohen formula






presented earlier, we obtain $\delta = 2\sqrt{4x^2/2} = 2\sqrt{2}\,x$. Thus $x = \delta/2\sqrt{2}$.
The next step is to relate $\delta$ to the untransformed animal densities.
Let the value of the true row main effect in the first row be f; let the
value of the true column main effect in the first column be g; m is the true
grand mean. The number of animals per square meter in each cell is then
as shown below:

            Inner                           Outer

Before      $I_B = 10^{m+f+g-x} - 1$        $O_B = 10^{m+f-g+x} - 1$

After       $I_A = 10^{m-f+g+x} - 1$        $O_A = 10^{m-f-g-x} - 1$



$I_B$ is the true mean number of animals per square meter at the inner stations
before plant operation; $I_A$ is the mean at the inner stations after operation
starts; $O_B$ is the mean at the outer stations before operation; $O_A$ is the
mean at the outer stations after operation starts. Ratios involving those
mean numbers of animals can be set up as follows.



$$\frac{I_B + 1}{O_B + 1} = \frac{10^{m+f+g-x}}{10^{m+f-g+x}} = 10^{2g-2x}$$

$$\frac{I_A + 1}{O_A + 1} = \frac{10^{m-f+g+x}}{10^{m-f-g-x}} = 10^{2g+2x}$$

In our design we want to compare the preoperational ratio with the
operational ratio. We do so by forming a new ratio R:

$$R = \frac{(I_A + 1)/(O_A + 1)}{(I_B + 1)/(O_B + 1)} = \frac{10^{2g+2x}}{10^{2g-2x}} = 10^{4x}$$

It was found earlier that $x = \delta/2\sqrt{2}$ when it is just detectable with P = .95.
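
A numerical check may help confirm that the ratio R depends only on the interaction effect x, and not on the grand mean or the main effects. The Python sketch below builds the four cell densities from arbitrary made-up values of m, f, g and x (not estimates from this report). It assumes the sign pattern in which the interaction effect is -x for the inner stations before operation and for the outer stations after operation, and +x for the other two cells; the function name `cell_density` is ours.

```python
import math


def cell_density(m, f, g, x, before, inner):
    """True density (animals per square meter) implied by the log-scale model."""
    row_effect = f if before else -f
    column_effect = g if inner else -g
    interaction = -x if before == inner else x
    return 10 ** (m + row_effect + column_effect + interaction) - 1


# Arbitrary illustrative values on the log10 scale.
m, f, g, x = 3.2, 0.15, -0.30, 0.18

i_b = cell_density(m, f, g, x, before=True, inner=True)
o_b = cell_density(m, f, g, x, before=True, inner=False)
i_a = cell_density(m, f, g, x, before=False, inner=True)
o_a = cell_density(m, f, g, x, before=False, inner=False)

ratio_before = (i_b + 1) / (o_b + 1)
ratio_after = (i_a + 1) / (o_a + 1)
r = ratio_after / ratio_before

print(f"R from cell densities: {r:.6f}")
print(f"10 ** (4 * x):         {10 ** (4 * x):.6f}")
```

Changing m, f or g leaves R unchanged, so the comparison of operational with preoperational ratios isolates the interaction.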






Thus, if R is to be detectable it must obey the following inequality:

$$R \geq 10^{4\delta/2\sqrt{2}} = 10^{\sqrt{2}\,\delta}$$

$\delta$ was previously called the least detectable true difference of the transform-
ed variable. We now designate the quantity $10^{\sqrt{2}\,\delta}$ as the "least detectable
true ratio." The test from which the Sokal and Rohlf formula comes is
two-tailed. This means that a true population change is detectable with
P = .95 either if $R \geq 10^{\sqrt{2}\,\delta}$ or $R \leq 10^{-\sqrt{2}\,\delta}$. (Our test can pick
up either a relative increase or a relative decrease of the inner populations.)
To find a numerical estimate of R for the benthos survey, we now obtain a
value for $\delta$.

Evaluation of $\delta$ Using the Variance Determined Previously

Values required by the Sokal and Rohlf formula for $\delta$ that have been
determined in the preceding sections are: $\alpha$ = .05, P = .95, $\hat{\sigma}$ = .87479 and
n = 108. The quantity still needed is $\nu$, the degrees of freedom for the
denominator of the F-test. Table 1 indicates that $\nu$ = 6 for our chosen
denominator mean square. It was previously mentioned that the numerator
mean square has a single degree of freedom. If it had more than one, then the
Sokal and Rohlf formula could not be used. The five relevant quantities can
now be substituted in the formula:

$$\delta = .52260$$
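
The substitution can be sketched in Python. This is an illustration under stated assumptions rather than the report's own worked line: it assumes the Sokal and Rohlf formula has the familiar form n >= 2(sigma/delta)^2 {t_alpha[nu] + t_2(1-P)[nu]}^2, solved here for delta, and it takes the two t values for 6 degrees of freedom (2.447 and 1.943) from a standard two-tailed t table. Neither t value is quoted in the text.

```python
from math import sqrt

sigma_hat = 0.87479  # error standard deviation from the preceding section
n = 108              # observations per cell of the collapsed 2 x 2 layout

# Assumed two-tailed t values for nu = 6 (standard table, not from the text).
t_alpha = 2.447      # alpha = .05
t_power = 1.943      # 2(1 - P) = .10, for power P = .95

# Sample-size formula rearranged for the least detectable true difference.
delta = sigma_hat * sqrt(2.0 / n) * (t_alpha + t_power)

print(f"least detectable true difference = {delta:.5f}")
```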






Thus, the least detectable true ratio R = 5.484. Benthos at the inner
stations must undergo a true 5.5-fold increase or decrease relative to the
outer stations in order to be detected by the eight-year survey.
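
The final conversion from the least detectable difference to the least detectable true ratio can be verified with a few lines of Python. The sketch uses the value .52260 for the least detectable difference and the relations x = delta/(2*sqrt(2)) and R = 10^(4x) from the derivation; the variable names are ours.

```python
from math import sqrt

delta = 0.52260  # least detectable true difference on the log10 scale

# Interaction effect that is just detectable, then the corresponding ratio.
x = delta / (2 * sqrt(2))
least_detectable_ratio = 10 ** (4 * x)

# The test is two-tailed, so the reciprocal bounds a detectable decrease.
least_detectable_decrease = 1 / least_detectable_ratio

print(f"x = {x:.5f}")
print(f"least detectable true ratio = {least_detectable_ratio:.3f}")
print(f"or, for a decrease, a ratio below {least_detectable_decrease:.3f}")
```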



REFERENCES 



Ayers, J. C., W. L. Yocum, H. K. Soo, T. W. Bottrell, S. C. Mozley and L. C.
Garcia (1971): Benton Harbor Power Plant Limnological Studies, part 9,
"The biological survey of 10 July 1970." Great Lakes Research Division,
University of Michigan, Ann Arbor, Michigan.

Beak Consultants (1973): "Evaluation of pre-operational biological conditions
of Lake Michigan in the vicinity of the Palisades nuclear plant." A
report to Consumers Power Co., Jackson, Michigan, March 2, 1973.

Cohen, Jacob (1969): Statistical Power Analysis for the Behavioral Sciences.
Academic Press, New York. 415 p.

Elliott, J. M. (1971): Some Methods for the Statistical Analysis of
Samples of Benthic Invertebrates. Freshwater Biological Association,
Scientific Publication No. 25, 144 p.

Johnston, Edward M. (1973): "Effect of a thermal discharge on benthos
populations: statistical methods for assessing the impact of the
Cook nuclear plant." Benton Harbor Power Plant Limnological Studies,
part 18, Great Lakes Research Division, University of Michigan, Ann
Arbor, Michigan. 20 p.

Kirk, Roger E. (1968): Experimental Design: Procedures for the Behavioral
Sciences. Brooks/Cole, Belmont, California. 577 p.

Mozley, Samuel C. (1973): "Study of benthic organisms." In J. C. Ayers
and E. Seibel (ed.), Benton Harbor Power Plant Limnological Studies,
part 13, "Cook Plant pre-operational studies 1972," p. 178-242.
Great Lakes Research Division, University of Michigan, Ann Arbor,
Michigan.

Myers, Jerome L. (1972): Fundamentals of Experimental Design. Second
edition. Allyn and Bacon, Boston. 465 p.

Scheffe, Henry (1959): The Analysis of Variance. Wiley, New York. 477 p.

Sokal, Robert R. and F. James Rohlf (1969): Biometry. W. H. Freeman, San
Francisco. 776 p.

Winer, B. J. (1971): Statistical Principles in Experimental Design. Second
edition. McGraw-Hill, New York.






